Mind the Gap: Machine Translation by Minimizing the Semantic Gap in Embedding Space
Authors
Abstract
Conventional statistical machine translation (SMT) methods decode by composing a set of translation rules that are associated with high probabilities. However, the probabilities of the translation rules are calculated only from co-occurrence statistics in the bilingual corpus, not from the similarity of their semantic meaning. In this paper, we propose a Recursive Neural Network (RNN) based model that converts each translation rule into a compact real-valued vector in the semantic embedding space and performs decoding by minimizing the semantic gap between the source language string and its translation candidates at each state in a bottom-up structure. The RNN-based translation model is trained using a max-margin objective function. Extensive experiments on Chinese-to-English translation show that our RNN-based model can significantly improve translation quality, by up to 1.68 BLEU points.

Introduction

Conventional statistical machine translation (SMT) models, such as phrase-based models (Koehn et al. 2007), formal syntax-based models (Chiang 2007; Xiong, Liu, and Lin 2006) and linguistically syntax-based models (Liu, Liu, and Lin 2006; Huang, Knight, and Joshi 2006; Galley et al. 2006; Zhang et al. 2008), decode and generate the translation result by composing a set of translation rules that are associated with high probabilities. The probabilities of the translation rules (e.g. the phrasal translation probabilities and the lexical weights in phrase-based and formal syntax-based models) are all computed from the co-occurrence statistics of the rule's source and target sides in the bilingual corpus. However, co-occurrence statistics are heavily biased toward the bilingual corpus and are not sufficient to show whether the source and target sides of a translation rule have the same meaning, especially for low-frequency but correct translation rules.
Accordingly, conventional SMT models cannot guarantee that the generated translations are the closest in semantic meaning to the source-side inputs.

Copyright © 2014, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.

Example phrase pairs (English / Chinese):
developing / 发展
the relations between / 之间 的 关系
the two countries / 两 国
the relations between the two countries / 两 国 之间 的 关系
developing the relations between the two countries / 发展 两 国 之间 的 关系
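The idea of scoring translation candidates by their semantic gap can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration and is not taken from the paper: the composition function (a single `tanh` layer over the concatenated child vectors, as is common for recursive networks), the use of squared Euclidean distance as the gap, the hinge form of the max-margin loss, the helper names `compose`, `semantic_gap`, `best_candidate` and `max_margin_loss`, and all numeric values.

```python
import math


def compose(left, right, weights, bias):
    """Build a parent vector from two child vectors: tanh(W [left; right] + b).

    Assumed composition function; the paper's exact form may differ.
    """
    children = left + right  # list concatenation of the two child vectors
    return [
        math.tanh(sum(w * x for w, x in zip(row, children)) + b)
        for row, b in zip(weights, bias)
    ]


def semantic_gap(source_vec, target_vec):
    """Squared Euclidean distance between two vectors (assumed gap measure)."""
    return sum((s - t) ** 2 for s, t in zip(source_vec, target_vec))


def best_candidate(source_vec, candidates):
    """Return the name of the candidate whose vector is closest to the source."""
    return min(candidates, key=lambda name: semantic_gap(source_vec, candidates[name]))


def max_margin_loss(gap_good, gap_bad, margin=1.0):
    """Hinge loss: zero once the good gap is smaller than the bad gap by `margin`."""
    return max(0.0, margin + gap_good - gap_bad)


# Toy 2-dimensional parameters and vectors, invented for illustration only.
weights = [[0.5, 0.0, 0.5, 0.0],
           [0.0, 0.5, 0.0, 0.5]]
bias = [0.0, 0.0]

# Vector for a source-side span built from two smaller spans.
source = compose([0.2, 0.4], [0.6, 0.0], weights, bias)

# Vectors for two hypothetical target-side candidates.
candidates = {
    "the relations between the two countries":
        compose([0.6, 0.0], [0.2, 0.4], weights, bias),
    "an unrelated candidate":
        compose([-0.8, 0.9], [-0.6, 0.7], weights, bias),
}

chosen = best_candidate(source, candidates)
loss = max_margin_loss(
    semantic_gap(source, candidates["the relations between the two countries"]),
    semantic_gap(source, candidates["an unrelated candidate"]),
)
print(chosen, loss)
```

In a real decoder the selection step would run at every state of the bottom-up search, and the loss would be used to train the composition parameters; the sketch only shows the scoring logic on fixed toy vectors.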
Similar resources
Bridging the semantic gap for software effort estimation by hierarchical feature selection techniques
Software project management is one of the significant activities in the software development process. Software Development Effort Estimation (SDEE) is a challenging task in software project management. SDEE has been practiced in the computer industry since the 1940s and has been reviewed several times. An SDEE model is appropriate if it provides the accuracy and confidence simultaneously before softwa...
On Health Policy and Management (HPAM): Mind the Theory-Policy-Practice Gap
We argue that the field of Health Policy and Management (HPAM) ought to confront the gap between theory, policy, and practice. Although there are perennial efforts to reform healthcare systems, the conceptual barriers are considerable and reflect the theory-policy-practice gap. We highlight four dimensions of the gap: 1) the dominance of microeconomic thinking in health policy analysis and desi...
A Comparative Study of English-Persian Translation of Neural Google Translation
Many studies abroad have focused on neural machine translation and almost all concluded that this method was much closer to humanistic translation than machine translation. Therefore, this paper aimed at investigating whether neural machine translation was more acceptable in English-Persian translation in comparison with machine translation. Hence, two types of text were chosen to be translated...
Health Policy and Management: In Praise of Political Science; Comment on “On Health Policy and Management (HPAM): Mind the Theory-Policy Practice Gap”
Health systems have entered a third era embracing whole systems thinking and posing complex policy and management challenges. Understanding how such systems work and agreeing what needs to be put in place to enable them to undergo effective and sustainable change are more pressing issues than ever for policy-makers. The theory-policy-practice-gap and its four dimensions, as articulated by Chini...
A Survey on the Rate of Public Satisfaction about Subway Facilities in the City of Tehran Using Servqual Model
The Tehran suburban rail operating company (Tehran subway) provides public transportation services to more than 3 million people a day. Therefore, the way these services are provided and the rate of customer satisfaction with them are highly important. In the matters applied to the survey, first the expectations that users of this public transportation system have and the...
Publication date: 2014